Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASCAT #1332

Merged
merged 68 commits into from
Mar 15, 2022
Merged

ASCAT #1332

merged 68 commits into from
Mar 15, 2022

Conversation

lassefolkersen
Copy link
Contributor

@lassefolkersen lassefolkersen commented Feb 21, 2022

PR checklist

First PR for #186 ASCAT. This is the first working copy that can process from a BAM-input to ascat output. There's still a lot of issues to work out:

  • requires downloading and unzipping two files from dropbox, according to ASCAT instructions. Would be preferable to have done more automatically (they are not even very big: loci-file and alleles-file). Edit: gave up on this. They are now just required inputs, with description on how to obtain given in meta.yml
  • definetly does not work with current nf-core test data, seems to need large bam files - tested with these two files (bam1 and bam2). Edit: gave up on this, and put in an empty test instead
  • Uses a mulled docker container (biocontainers/mulled-v2-c278c7398beb73294d78639a864352abef2931ce) - needs to update so singularity also works
  • Could work more on output, e.g. it's current a '.rdata' file with an R-list, but could easily reformat to json + csv or such for more general use. (edit: it now outputs as the previous sarek pipeline does)
  • Test if it works with cram/crai files out-of-the box (edit: it does)
  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd

@lassefolkersen lassefolkersen added the WIP Work in progress label Feb 21, 2022
@maxulysse
Copy link
Member

we have loci files (also loci_gc files) on AWS igenomes, I can probably talk to @ewels about adding all necessary files.
In sarek we have a r script used to run ascat that export the files: https://github.com/nf-core/sarek/blob/master/bin/run_ascat.r

@lassefolkersen
Copy link
Contributor Author

lassefolkersen commented Feb 21, 2022

we have loci files (also loci_gc files) on AWS igenomes, I can probably talk to @ewels about adding all necessary files.
In sarek we have a r script used to run ascat that export the files: https://github.com/nf-core/sarek/blob/master/bin/run_ascat.r

This is a good idea @maxulysse --- will you talk to @ewels ? It's four sets of (mandatory) dropbox-linked files listed in this short readme. The loci and allele files are mandatory (x2 for hg19 and hg38), and probably would be good to grab the gcfile and Replication timing sets as well. As long as they are somewhere better than dropbox, I think I can get them into the code auto-install style.

For the second part -- output like in sarek's run_ascat.r - I've started work on that already, in the latest commit. Not easy. Lot's of both mandatory and optional arguments to juggle. Getting there :-) (edit -- ok, it's ready now. It outputs like run_ascat.r)

@lassefolkersen
Copy link
Contributor Author

Status for today: Fixed args setups, such that args are given similarly to @FriederikeHanssen 's setup in ControlFreec. One test without, one test with. Should behave largely similar to the run_ascat.r setup in current sarek, which is the goal. Still need to fix (more) permanent access to dropbox files + probably need to have tests as stub since it needs large bams to complete.

Copy link
Contributor Author

@lassefolkersen lassefolkersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I'm done here. I can't figure out why the conda-test fails. I've seen it's a bit weird before, it's like it worked sometimes. Other than that, no loose ends I think

@FriederikeHanssen
Copy link
Contributor

not sure why I can't merge t master back in, but once that is done, this is good to be merged in my opinion

@lassefolkersen lassefolkersen removed the request for review from maxulysse March 15, 2022 10:17
@lassefolkersen lassefolkersen merged commit d6244b4 into nf-core:master Mar 15, 2022
@lassefolkersen
Copy link
Contributor Author

Oh right -- I think master branch of modules had a few updates. Those are merged in, fast-forward. I squashed and merged. Thanks!

@edmundmiller edmundmiller linked an issue Mar 18, 2022 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

new module: ascat
5 participants